Simple API - working with JSON

We are going to work with JSON files that come from what are called public APIs (APIs that anyone can interact with). Loading packages:

library("jsonlite")
library("tidyverse")
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter()  masks stats::filter()
## ✖ purrr::flatten() masks jsonlite::flatten()
## ✖ dplyr::lag()     masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Now let’s move on to JSON files we can access directly from the web, via a simple public API. Essentially, we are able to input a URL into the fromJSON function, and read whatever JSON file is returned. For now, we are just going to query a public API that has a single non-variable endpoint which returns (in this case) a random fact about cats:

api_url <- "https://catfact.ninja/fact"

fromJSON(api_url)
## $fact
## [1] "The life expectancy of cats has nearly doubled over the last fifty years."
## 
## $length
## [1] 73
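
To see what fromJSON does without touching the network, the following minimal sketch parses a JSON string with the same shape as the response above. The fact text is a made-up placeholder, not real API output.

```r
library(jsonlite)

# a hand-written JSON string mimicking the shape of the cat fact response
# (the fact itself is a made-up placeholder, not real API output)
json_text <- '{"fact": "Cats sleep a lot.", "length": 17}'

# fromJSON accepts a JSON string as well as a URL or a file path
parsed <- fromJSON(json_text)

# a JSON object becomes a named R list
str(parsed)
parsed$fact
```

Each key in the JSON object becomes a named element of the list, which is why the output above prints as $fact and $length.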

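Because this API is so simple, we need to get a little creative if we want more than one result: we call the endpoint repeatedly in a loop, pausing for a second between requests so that we don’t overload the server. Each call returns a list, so we then use lapply() to convert each element of our output list into a tibble with as_tibble(), and do.call(rbind, ...) to stack those tibbles into one.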

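api_queryer <- function(results_count = 10, api_url = "https://catfact.ninja/fact"){
  
  # pre-allocate a list with one slot per request
  results <- vector("list", results_count)

  # seq_len() is safe even when results_count is 0
  for(i in seq_len(results_count)){
    results[[i]] <- fromJSON(api_url)  
    Sys.sleep(1) # pause between requests to be polite to the server
  }
  
  return(results)
}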

# run our function -- we can leave the defaults
cat_facts_output <- api_queryer()

# convert to a nicely formatted tibble (though in this case it would be ideal if it were a kibble -- this is a cat joke):
cat_facts <- do.call(rbind, lapply(cat_facts_output, as_tibble))

cat_facts
## # A tibble: 10 × 2
##    fact                                                                   length
##    <chr>                                                                   <int>
##  1 If your cat snores, or rolls over on his back to expose his belly, it…     90
##  2 The cat has 500 skeletal muscles (humans have 650).                        51
##  3 Baking chocolate is the most dangerous chocolate to your cat.              61
##  4 Two members of the cat family are distinct from all others: the cloud…    354
##  5 Neutering a cat extends its life span by two or three years.               60
##  6 Cats are the world's most popular pets, outnumbering dogs by as many …     84
##  7 A cat rubs against people not only to be affectionate but also to mar…    174
##  8 A cat has the ability to rotate their ears 180 degrees,with the help …    113
##  9 A cat’s jaw can’t move sideways, so a cat can’t chew large chunks of …     74
## 10 Cats see six times better in the dark and at night than humans.            63
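
One caveat with do.call(rbind, ...) is that it expects every result to have the same fields. As a hedged alternative, the sketch below uses dplyr's bind_rows(), which matches columns by name and fills gaps with NA. The records here are made-up stand-ins for API results, and the second one deliberately lacks a length field.

```r
library(tibble)
library(dplyr)

# made-up stand-ins for API results; the second one has no "length" field
fake_results <- list(
  list(fact = "First placeholder fact.", length = 23L),
  list(fact = "Second placeholder fact.")
)

# convert each record to a one-row tibble, then stack them;
# bind_rows() matches columns by name and fills missing ones with NA
fake_facts <- bind_rows(lapply(fake_results, as_tibble))

fake_facts
```

With rbind, the same input would fail because the two tibbles have different numbers of columns.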

AIC API

API secret keys and environment

Now let’s say we’re working with more intricate APIs, where we need to request access from the platform. If our request is accepted, we typically receive an API key and a client ID. It is usually a condition of getting an API key that you keep it secure and restrict its use to your own research. That means you cannot hardcode your API key into your script, and you cannot share it on, e.g., GitHub. But R still needs to know the key so that you can run and knit your code. The best practice is to store your API key in a local environment file.

# The command below opens your .Renviron file (creating it if needed). It is stored locally on your computer; never share it on GitHub or anywhere else
usethis::edit_r_environ()
## ☐ Edit '/Users/poirot/.Renviron'.
## ☐ Restart R for changes to take effect.

Once you’ve created an .Renviron file, you can populate it with your very own secret API key and client ID. It might look something like this:

NAME="Marion Lieutaud"
APIKEY="xxxxxxxxx"

Then save your .Renviron file and restart RStudio. Once you’ve done that, you should be able to access your environment data in the following way:

Sys.getenv('NAME')
## [1] ""
# This should return "Marion Lieutaud" in my case
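
Sys.getenv() returns an empty string rather than an error when a variable is not set, which is easy to miss. The sketch below shows one way to guard against that. The helper get_env_or_stop and the variable name DEMO_APIKEY are hypothetical, and Sys.setenv() is used only so the example runs on its own; in real use the value would come from .Renviron.

```r
# for demonstration only: set the variable for this session.
# in real use the value lives in .Renviron and never appears in a script
Sys.setenv(DEMO_APIKEY = "xxxxxxxxx")

# hypothetical helper: fetch a variable and fail loudly if it is missing
get_env_or_stop <- function(var) {
  value <- Sys.getenv(var, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable '", var, "' is not set; check your .Renviron and restart R.")
  }
  value
}

api_key <- get_env_or_stop("DEMO_APIKEY")

# check the key was found without printing it
nzchar(api_key)
```

Failing early like this avoids sending requests with an empty key and then puzzling over an authentication error from the API.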

Artwork Search API

Loading packages:

library("httr")
library("jsonlite")
library("tidyverse")
library("jpeg") #to let us read .jpegs/.jpgs
library("grid") #to let us plot images

Our cat facts API is cute but it’s not terribly useful unless we want to pull “random” objects out of an API pipeline, which probably is not the case. So instead, let’s explore a more useful public API, from the Art Institute of Chicago. This API has multiple models or “resources” (essentially, representations of the underlying data that exist in some relational databases somewhere – more next week), each of which can be queried via three endpoints.

Much like our cat facts API, we can just do a direct call to the base URL, which corresponds in this case to the listings endpoint. Queries to this endpoint return pages from all listings of the AIC collection. By default, we get back the first page of results only, and we get a lot of data for each artwork or each artist (depending on which model we query):

artworks_url <- "https://api.artic.edu/api/v1/artworks"

# fromJSON(artworks_url)

artists_url <- "https://api.artic.edu/api/v1/artists"

# fromJSON(artists_url)

Let’s focus, for now, on the artworks model. As we just saw, our query produced a large number of columns (“fields”), many of which we don’t really want or need. Consulting the documentation, and using what we know about the structure of URLs, we see that we can specify fields for our query:

artworks_url_fields <- "https://api.artic.edu/api/v1/artworks?fields=id,title,artist_display,date_display"

fromJSON(artworks_url_fields)
## $pagination
## $pagination$total
## [1] 127616
## 
## $pagination$limit
## [1] 12
## 
## $pagination$offset
## [1] 0
## 
## $pagination$total_pages
## [1] 10635
## 
## $pagination$current_page
## [1] 1
## 
## $pagination$next_url
## [1] "https://api.artic.edu/api/v1/artworks?page=2&fields=id%2Ctitle%2Cartist_display%2Cdate_display"
## 
## 
## $data
##       id                                          title
## 1  14620                        Cliff Walk at Pourville
## 2  15857                                  Cabaret Scene
## 3  15854                             Seated Female Nude
## 4  20684                        Paris Street; Rainy Day
## 5  18579                                       Chickens
## 6  21954                      Bird-Shaped Water Dropper
## 7  21893                       Bamboo Shoot-Shaped Ewer
## 8  22525            Bird Shaped Ewer with Daoist Priest
## 9  22191                               Claudine Resting
## 10 24202                                        La Java
## 11 24306                           Blue and Green Music
## 12 27949 Madame Roulin Rocking the Cradle (La berceuse)
##                                   date_display
## 1                                         1882
## 2                                      c. 1920
## 3                                      c. 1925
## 4                                         1877
## 5                                         1933
## 6  Goryeo dynasty (918–1392), mid–12th century
## 7      Goryeo dynasty (918–1392), 12th century
## 8      Goryeo dynasty (918–1392), 12th century
## 9                                         1913
## 10                                        1925
## 11                                     1919–21
## 12                                        1889
##                                      artist_display
## 1                  Claude Monet (French, 1840–1926)
## 2                    André Lhote\nFrench, 1885-1962
## 3                    André Lhote\nFrench, 1885-1962
## 4           Gustave Caillebotte (French, 1848–1894)
## 5                 Edgar Miller\nAmerican, born 1899
## 6                                             Korea
## 7                                             Korea
## 8                                             Korea
## 9  Jules Pascin\nAmerican, born Bulgaria, 1885-1930
## 10           Georges Emile Capon\nFrench, 1890-1980
## 11           Georgia O'Keeffe (American, 1887–1986)
## 12              Vincent van Gogh (Dutch, 1853–1890)
## 
## $info
## $info$license_text
## [1] "The `description` field in this response is licensed under a Creative Commons Attribution 4.0 Generic License (CC-By) and the Terms and Conditions of artic.edu. All other data in this response is licensed under a Creative Commons Zero (CC0) 1.0 designation and the Terms and Conditions of artic.edu."
## 
## $info$license_links
## [1] "https://creativecommons.org/publicdomain/zero/1.0/"
## [2] "https://www.artic.edu/terms"                       
## 
## $info$version
## [1] "1.13"
## 
## 
## $config
## $config$iiif_url
## [1] "https://www.artic.edu/iiif/2"
## 
## $config$website_url
## [1] "http://www.artic.edu"
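
The $pagination element tells us how the full listing is split into pages. As a rough sketch, the arithmetic and the URL for a later page can be reproduced as follows. The helper page_url is hypothetical, and the `limit` query parameter is an assumption based on the pagination fields shown above (the `page` parameter appears in next_url); check the API documentation before relying on either.

```r
# values copied from the $pagination element printed above
total <- 127616
limit <- 12

# number of pages needed to cover every record
total_pages <- ceiling(total / limit)
total_pages

# hypothetical helper: build the listing URL for a given page
# (assumes the `page` and `limit` query parameters)
page_url <- function(page, limit = 12,
                     base = "https://api.artic.edu/api/v1/artworks",
                     fields = "id,title,artist_display,date_display") {
  paste0(base, "?page=", page, "&limit=", limit, "&fields=", fields)
}

page_url(2)
```

The computed page count matches the total_pages value the API reported.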

Now let’s switch to a different endpoint, the detail endpoint, where we can request information on specific artworks. We’re still using the artworks model, and again we’ll only query specific fields for the artwork(s) of interest. We’ll start to build this up in a slightly more principled fashion, using paste0(), which concatenates strings. Below, the first string in our paste0() call is the artworks_url model URL we defined above, the second string is the "/" separator that the URL format requires, the third string is the ID of the specific artwork of interest (can you figure out which artwork it is?), and the fourth string is the specific set of fields we want.

# define our fields of interest
fields <- "?fields=id,title,artist_display,date_display"

# provide an artwork to study
artwork <- "28560"

# build the query and retrieve JSON
artwork_detail_url <- paste0(artworks_url, "/", artwork, fields)

fromJSON(artwork_detail_url)
## $data
## $data$id
## [1] 28560
## 
## $data$title
## [1] "The Bedroom"
## 
## $data$date_display
## [1] "1889"
## 
## $data$artist_display
## [1] "Vincent van Gogh (Dutch, 1853–1890)"
## 
## 
## $info
## $info$license_text
## [1] "The `description` field in this response is licensed under a Creative Commons Attribution 4.0 Generic License (CC-By) and the Terms and Conditions of artic.edu. All other data in this response is licensed under a Creative Commons Zero (CC0) 1.0 designation and the Terms and Conditions of artic.edu."
## 
## $info$license_links
## [1] "https://creativecommons.org/publicdomain/zero/1.0/"
## [2] "https://www.artic.edu/terms"                       
## 
## $info$version
## [1] "1.13"
## 
## 
## $config
## $config$iiif_url
## [1] "https://www.artic.edu/iiif/2"
## 
## $config$website_url
## [1] "http://www.artic.edu"
# to show only the data we want
fromJSON(artwork_detail_url)$data
## $id
## [1] 28560
## 
## $title
## [1] "The Bedroom"
## 
## $date_display
## [1] "1889"
## 
## $artist_display
## [1] "Vincent van Gogh (Dutch, 1853–1890)"
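
If we expect to look up several artworks, the paste0() logic can be wrapped in a small function. This is only a sketch: the name artwork_detail_url_for is hypothetical, and it builds URLs without querying anything.

```r
# hypothetical helper wrapping the paste0() logic used for the detail endpoint
artwork_detail_url_for <- function(artwork_id,
                                   fields = c("id", "title", "artist_display", "date_display"),
                                   base_url = "https://api.artic.edu/api/v1/artworks") {
  # collapse the field names into the comma-separated form the API expects
  paste0(base_url, "/", artwork_id, "?fields=", paste(fields, collapse = ","))
}

# a single artwork
artwork_detail_url_for("28560")

# paste0() is vectorised, so several ids give several URLs
artwork_detail_url_for(c("28560", "27949"), fields = c("id", "title"))
```

Each resulting URL can then be passed to fromJSON() as before, ideally with a Sys.sleep() between requests.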

The next endpoint is perhaps the most interesting for us: the search endpoint. This allows us to search the model of interest, and return only the data that results from that search. This is great because it lets us narrow down our requests and not overload the AIC’s servers, and because it lets us look for specific types of art (you can imagine how useful this would be in a social science application). In furtherance of our feline API efforts, let’s start by searching for artwork about cats:

# artworks model, search endpoint url:
artworks_search_url <- "https://api.artic.edu/api/v1/artworks/search?q="

# define search terms. we use gsub(" ", "%20", "x") here to replace spaces between search terms with "%20" which is how we often represent spaces in a URL. 
search_terms <- gsub(" ", "%20", "cat")

# build the query:
cat_search_url <- paste0(artworks_search_url, search_terms)

fromJSON(cat_search_url)
## $preference
## NULL
## 
## $pagination
## $pagination$total
## [1] 7869
## 
## $pagination$limit
## [1] 10
## 
## $pagination$offset
## [1] 0
## 
## $pagination$total_pages
## [1] 787
## 
## $pagination$current_page
## [1] 1
## 
## 
## $data
##       _score     id api_model                                     api_link
## 1  135.57333    656  artworks    https://api.artic.edu/api/v1/artworks/656
## 2  119.93909 117241  artworks https://api.artic.edu/api/v1/artworks/117241
## 3  116.97719  45259  artworks  https://api.artic.edu/api/v1/artworks/45259
## 4  104.34673  16227  artworks  https://api.artic.edu/api/v1/artworks/16227
## 5   97.81632  22482  artworks  https://api.artic.edu/api/v1/artworks/22482
## 6   94.19799  51719  artworks  https://api.artic.edu/api/v1/artworks/51719
## 7   92.74848 158921  artworks https://api.artic.edu/api/v1/artworks/158921
## 8   92.40688 119335  artworks https://api.artic.edu/api/v1/artworks/119335
## 9   92.03020  68825  artworks  https://api.artic.edu/api/v1/artworks/68825
## 10  89.77347   5522  artworks   https://api.artic.edu/api/v1/artworks/5522
##    is_boosted                                          title
## 1       FALSE           Lion (One of a Pair, South Pedestal)
## 2       FALSE                                  Girl with Cat
## 3       FALSE                                 Nude with Cats
## 4       FALSE                                  Cat Making Up
## 5       FALSE                                   Homesickness
## 6       FALSE                       Winter: Cat on a Cushion
## 7       FALSE                   Courtesan Playing with a Cat
## 8       FALSE Baroque Pearl Mounted as a Cat Holding a Mouse
## 9       FALSE                           The Cats' Rendezvous
## 10      FALSE                                     Cat Coffin
##                                                                                                                                                                                                                                                                                                                                                                            thumbnail.lqip
## 1  data:image/gif;base64,R0lGODlhCAAFAPUAADY5NTQ/PkI4IT1COjtEQkNHQk5YWFteWWtkTmtoX291WnJ1WX1xZ355cYZ3Zn+Cf42KaoWEeJOKfpeWcoCEgI2OiJaQgp+XgpWRiJWTjKKbjqqhkK+olbGml7q0q7e2sMG5rcbAs8fBuMrGv8/HvsnJud3d3evr7gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAAAAAAALAAAAAAIAAUAAAYlQJPnIwKRQpPTyPDgQDaC0gUQIAwsDISjcUgUMB2JpkLJRBSLIAA7
## 2                                                                                                                                                          data:image/gif;base64,R0lGODlhBAAFAPQAACUiHCckHSghGi4nGislHi0mHjMvJDUuJDUwITcwIz07LUY+LUhALktGNExGNFFINGhYOWdcQXVlRYRmSAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAAAAAAALAAAAAAEAAUAAAUR4EAwC3JMjxJJhgM1QSEkQAgAOw==
## 3                                                                                                                                                          data:image/gif;base64,R0lGODlhBAAFAPQAAFpcVVthVGBgSndvSW5hUWtlWHJvVWxpYHJ1bYd+an2Feo6Da4iGcZmTfqWdi6SgkK+rnLS7q7u4qb68rAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAAAAAAALAAAAAAEAAUAAAURoBIIQ8QQBlIsDXAkzjNJUAgAOw==
## 4                                                                                                                                                          data:image/gif;base64,R0lGODlhBAAFAPQAACAmNC0wPW1VOTw/SVlWUXBlVnV2doVnQ494WYx8aI+BboWAedipa4iOlYyQlI+UmaabkLKpmta2jKi52wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAAAAAAALAAAAAAEAAUAAAURoAJNw5IUQEMggWNIwhMxRwgAOw==
## 5                                                                                                                                                          data:image/gif;base64,R0lGODlhBAAFAPQAAJB4cbqIcIp/irSQgrWXhbScjbGZlqqdqqicr66iusCklcGpm7m0zb+409LH0NPF09XJ1tbL1tnM2d7P3wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAAAAAAALAAAAAAEAAUAAAURYCRBDtNMD5IIxxAARkEoSwgAOw==
## 6                                                                                                                                                  data:image/gif;base64,R0lGODlhBgAFAPQAAEI9JFFFLV5QPWVaOn1VR31rUHhpV3BnXHprWYBhRpJgUIdyU6VqW6pxXKR5WId9b458bcBvZcdwaM14btR3b9V8c9F/eI6Dc6uHeqKSfcORhtiVjdmWjt6ZkAAAAAAAACH5BAAAAAAALAAAAAAGAAUAAAUY4GUgxZI9QgAk2jE4CrNBhFVRHdZMkcSFADs=
## 7                                                                                                                                                                                                                                  data:image/gif;base64,R0lGODlhAwAFAPMAAK+RdLmlgryjhrCgiL2oj8KymsWzmMq2msvArMvBrNDBrNDGsdjLtNnOtwAAAAAAACH5BAAAAAAALAAAAAADAAUAAAQLMLFmRCvhDECWQhEAOw==
## 8                                                                                                                                                  data:image/gif;base64,R0lGODlhBgAFAPQAAKGJbLOYdbKafMGhdJ6fn6eWgLSgiLCnnKKhoaWjpKakpaelpqenp66wtbKwsbSztLW0tLi3uLe5v7y9vsasjsO6ssK9vMPAvsfBvMXBwMTCwsLCxcjIyO3m3wAAAAAAACH5BAAAAAAALAAAAAAGAAUAAAUYILMoCUJAjxMczXR1QyFtBiUAEYdVVqaFADs=
## 9                                                                                                                                                          data:image/gif;base64,R0lGODlhBAAFAPQAAC0nHFpSRH92Z4V+boyDdJOId6KWh6GYiaacjKidja6lkq6klLKolrmunb6zosO4psK6qMe6qNHEstPItwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACH5BAAAAAAALAAAAAAEAAUAAAURYIQs0zMojQBARxEwhOQkRggAOw==
## 10                                                                                                                                                                                                                                 data:image/gif;base64,R0lGODlhAwAFAPMAAGlQQXhiVG9yfZF9b4eAgJaKh4KGkKOXk6SZlJKWoaehoaKiqKmorbK5x+Tn8AAAACH5BAAAAAAALAAAAAADAAUAAAQL0BDBRkKhFbDcUREAOw==
##    thumbnail.width thumbnail.height
## 1             8430             5620
## 2             6486             7661
## 3             4603             5060
## 4             1645             2250
## 5             1754             2250
## 6             2928             2250
## 7             2039             3800
## 8             2911             2250
## 9             1767             2250
## 10            1500             2250
##                                                                                                                                    thumbnail.alt_text
## 1                       A bronze lion, deep green and muscular, looks out in the distance from its pedestal in front of the Art Institute of Chicago.
## 2                                                                                                                        A work made of oil on board.
## 3                                                                                                                    A work made of oil on cardboard.
## 4     A color woodblock print of two orangeand white cats with pink noses represented in cubist-like design against a blue, gray and black background
## 5                                                               A work made of watercolor and gouache, over touches of graphite, on cream wove paper.
## 6  A work made of lithograph in 6 colors (red, ochre, yellow, black, gray-brown, brown) from two stones, with scraping on stone, on ivory wove paper.
## 7                                                                                A work made of hand-colored woodblock print; tan-e, vertical o-oban.
## 8                                                                                                     A work made of gold, enamel, and baroque pearl.
## 9                                                                   A work made of lithograph in black on ivory wove paper, laid down on ivory cloth.
## 10                                                                                                                   A work made of wood and plaster.
##                    timestamp
## 1  2025-03-15T23:26:02-05:00
## 2  2025-03-15T22:39:07-05:00
## 3  2025-03-15T22:21:05-05:00
## 4  2025-03-15T22:14:04-05:00
## 5  2025-03-15T22:15:41-05:00
## 6  2025-03-15T22:22:54-05:00
## 7  2025-03-15T22:50:24-05:00
## 8  2025-03-15T22:39:41-05:00
## 9  2025-03-15T23:20:25-05:00
## 10 2025-03-15T22:11:34-05:00
## 
## $info
## $info$license_text
## [1] "The `description` field in this response is licensed under a Creative Commons Attribution 4.0 Generic License (CC-By) and the Terms and Conditions of artic.edu. All other data in this response is licensed under a Creative Commons Zero (CC0) 1.0 designation and the Terms and Conditions of artic.edu."
## 
## $info$license_links
## [1] "https://creativecommons.org/publicdomain/zero/1.0/"
## [2] "https://www.artic.edu/terms"                       
## 
## $info$version
## [1] "1.13"
## 
## 
## $config
## $config$iiif_url
## [1] "https://www.artic.edu/iiif/2"
## 
## $config$website_url
## [1] "http://www.artic.edu"
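
Replacing spaces with gsub() works for simple search terms, but other characters (such as "&") also have special meaning in a URL. As a hedged alternative, base R's utils::URLencode() with reserved = TRUE percent-encodes all of them. The search phrases below are arbitrary examples.

```r
search_phrase <- "black cat"

# the manual approach used above: only spaces are handled
manual <- gsub(" ", "%20", search_phrase)
manual

# URLencode() with reserved = TRUE also escapes characters like "&" and "?"
encoded <- utils::URLencode(search_phrase, reserved = TRUE)
encoded

# an example where the two approaches differ
encoded_amp <- utils::URLencode("cats & dogs", reserved = TRUE)
encoded_amp
```

For a plain phrase like "black cat" both approaches give the same string; they only diverge once the search term contains reserved characters.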

What we’ve done above is to use the logic of the AIC API to build particular URLs of interest, and then query them directly with fromJSON. Now let’s start writing queries in a slightly more elegant fashion, using the httr package (you can also use httr2, a newer rewrite of httr that its maintainers now recommend for new projects). These packages are high-level interfaces to curl, developed for flexible and customisable querying of web resources from R. For now, we will use the GET function from httr. Among other things, GET lets us supply the URL of interest, an additional path (which we won’t use in this case), and a query list with one named element per API parameter we want to set. GET also URL-encodes the values in that list for us.

# build the API GET request
cat_search <- GET(artworks_search_url, # the API endpoint of interest
                  query = list(q = search_terms, 
                               fields = "id,title,artist_display,date_display",
                               size = 10)) # query allows us to specify parameters, which we find in the API documentation

# parse the content returned from our GET request
json_cat_search <- content(cat_search, "parsed")

# let's inspect our content
json_cat_search
## $preference
## NULL
## 
## $pagination
## $pagination$total
## [1] 7869
## 
## $pagination$limit
## [1] 10
## 
## $pagination$offset
## [1] 0
## 
## $pagination$total_pages
## [1] 787
## 
## $pagination$current_page
## [1] 1
## 
## 
## $data
## $data[[1]]
## $data[[1]]$`_score`
## [1] 135.879
## 
## $data[[1]]$id
## [1] 656
## 
## $data[[1]]$title
## [1] "Lion (One of a Pair, South Pedestal)"
## 
## $data[[1]]$date_display
## [1] "1893"
## 
## $data[[1]]$artist_display
## [1] "Edward Kemeys (American, 1843–1907)\nAmerican Bronze Founding Company\nChicago"
## 
## 
## $data[[2]]
## $data[[2]]$`_score`
## [1] 120.4231
## 
## $data[[2]]$id
## [1] 117241
## 
## $data[[2]]$title
## [1] "Girl with Cat"
## 
## $data[[2]]$date_display
## [1] "1937"
## 
## $data[[2]]$artist_display
## [1] "Balthus (Baltusz Klossowski de Rola)\nFrench, 1908–2001"
## 
## 
## $data[[3]]
## $data[[3]]$`_score`
## [1] 116.9664
## 
## $data[[3]]$id
## [1] 45259
## 
## $data[[3]]$title
## [1] "Nude with Cats"
## 
## $data[[3]]$date_display
## [1] "1901"
## 
## $data[[3]]$artist_display
## [1] "Pablo Picasso\nSpanish, active France, 1881-1973"
## 
## 
## $data[[4]]
## $data[[4]]$`_score`
## [1] 104.4088
## 
## $data[[4]]$id
## [1] 16227
## 
## $data[[4]]$title
## [1] "Cat Making Up"
## 
## $data[[4]]$date_display
## [1] "1962"
## 
## $data[[4]]$artist_display
## [1] "Inagaki Tomoo\nJapanese, 1902–1980"
## 
## 
## $data[[5]]
## $data[[5]]$`_score`
## [1] 98.11973
## 
## $data[[5]]$id
## [1] 22482
## 
## $data[[5]]$title
## [1] "Homesickness"
## 
## $data[[5]]$date_display
## [1] "c. 1948"
## 
## $data[[5]]$artist_display
## [1] "René Magritte\nBelgian, 1898-1967"
## 
## 
## $data[[6]]
## $data[[6]]$`_score`
## [1] 94.25226
## 
## $data[[6]]$id
## [1] 51719
## 
## $data[[6]]$title
## [1] "Winter: Cat on a Cushion"
## 
## $data[[6]]$date_display
## [1] "1909"
## 
## $data[[6]]$artist_display
## [1] "Théophile-Alexandre Steinlen\nFrench, born Switzerland, 1859-1923"
## 
## 
## $data[[7]]
## $data[[7]]$`_score`
## [1] 93.10331
## 
## $data[[7]]$id
## [1] 158921
## 
## $data[[7]]$title
## [1] "Courtesan Playing with a Cat"
## 
## $data[[7]]$date_display
## [1] "c. 1715"
## 
## $data[[7]]$artist_display
## [1] "Kaigetsudo Dohan\nJapanese, active c. 1704-16"
## 
## 
## $data[[8]]
## $data[[8]]$`_score`
## [1] 92.47557
## 
## $data[[8]]$id
## [1] 119335
## 
## $data[[8]]$title
## [1] "Baroque Pearl Mounted as a Cat Holding a Mouse"
## 
## $data[[8]]$date_display
## [1] "17th century"
## 
## $data[[8]]$artist_display
## [1] "Spanish or south German"
## 
## 
## $data[[9]]
## $data[[9]]$`_score`
## [1] 92.40161
## 
## $data[[9]]$id
## [1] 68825
## 
## $data[[9]]$title
## [1] "The Cats' Rendezvous"
## 
## $data[[9]]$date_display
## [1] "1868"
## 
## $data[[9]]$artist_display
## [1] "Édouard Manet\nFrench, 1832-1883"
## 
## 
## $data[[10]]
## $data[[10]]$`_score`
## [1] 90.12851
## 
## $data[[10]]$id
## [1] 9372
## 
## $data[[10]]$title
## [1] "The Large Cat"
## 
## $data[[10]]$date_display
## [1] "1657"
## 
## $data[[10]]$artist_display
## [1] "Cornelis Visscher \nDutch, c. 1629-1658"
## 
## 
## 
## $info
## $info$license_text
## [1] "The `description` field in this response is licensed under a Creative Commons Attribution 4.0 Generic License (CC-By) and the Terms and Conditions of artic.edu. All other data in this response is licensed under a Creative Commons Zero (CC0) 1.0 designation and the Terms and Conditions of artic.edu."
## 
## $info$license_links
## $info$license_links[[1]]
## [1] "https://creativecommons.org/publicdomain/zero/1.0/"
## 
## $info$license_links[[2]]
## [1] "https://www.artic.edu/terms"
## 
## 
## $info$version
## [1] "1.13"
## 
## 
## $config
## $config$iiif_url
## [1] "https://www.artic.edu/iiif/2"
## 
## $config$website_url
## [1] "http://www.artic.edu"
# not so useful! so let's see what we got in a slightly easier way...
names(json_cat_search) 
## [1] "preference" "pagination" "data"       "info"       "config"
# $data is what we want. so let's use do.call, rbind, and lapply to extract all the data from our returned content, and format it as a tidy tibble
cat_art <- do.call(rbind, lapply(json_cat_search$data, as_tibble)) %>%
  select(-"_score") # removing the search score, but you can keep it if it is interesting to you

# let's look at our tibble
cat_art
## # A tibble: 10 × 4
##        id title                                      date_display artist_display
##     <int> <chr>                                      <chr>        <chr>         
##  1    656 Lion (One of a Pair, South Pedestal)       1893         "Edward Kemey…
##  2 117241 Girl with Cat                              1937         "Balthus (Bal…
##  3  45259 Nude with Cats                             1901         "Pablo Picass…
##  4  16227 Cat Making Up                              1962         "Inagaki Tomo…
##  5  22482 Homesickness                               c. 1948      "René Magritt…
##  6  51719 Winter: Cat on a Cushion                   1909         "Théophile-Al…
##  7 158921 Courtesan Playing with a Cat               c. 1715      "Kaigetsudo D…
##  8 119335 Baroque Pearl Mounted as a Cat Holding a … 17th century "Spanish or s…
##  9  68825 The Cats' Rendezvous                       1868         "Édouard Mane…
## 10   9372 The Large Cat                              1657         "Cornelis Vis…
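
To see what URL httr actually builds from a query list, we can use httr::modify_url(), which assembles the URL without sending a request. The sketch below also shows why search terms should be passed to query as plain text: httr percent-encodes the values itself, so a value that already contains "%20" is encoded a second time. The search phrase is an arbitrary example.

```r
library(httr)

# note: no "?q=" at the end; httr adds the query string itself
search_endpoint <- "https://api.artic.edu/api/v1/artworks/search"

# plain text in, encoded URL out
built_url <- modify_url(search_endpoint, query = list(q = "black cat", size = 10))
built_url

# a value that is already encoded gets encoded again ("%" becomes "%25")
pre_encoded_url <- modify_url(search_endpoint, query = list(q = "black%20cat"))
pre_encoded_url
```

For single-word searches such as "cat" the distinction does not matter, but for multi-word searches the double-encoded version asks the API for a different string.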

So far we have used the AIC API to extract information about the collection and its artworks. That’s nice, but there are more interesting things we can do. The AIC supports a second, different API that allows us to download .jpg copies of its artwork. We’re now going to learn how to download and visualise these images in R.

First, we have to retrieve the image id (not the same as the artwork id!) for the pieces of interest from the default API, by requesting the image_id field of the artworks model. Then we can query the alternative API to retrieve the actual images.

# query the API:
cat_image_search <- GET(artworks_search_url, # the API endpoint of interest
                    query = list(q = search_terms, 
                                 fields = "title, image_id",
                                 size = 1)) # query allows us to specify parameters, which we find in the API documentation
json_cat_image_search <- content(cat_image_search, "parsed")

# directly extract the image id (as we are just working with one request, we don't need to worry about flattening the data)
cat_image_id <- json_cat_image_search$data[[1]]$image_id

# now, we introduce our alternative API, the AIC's IIIF (International Image Interoperability Framework) API
iiif_url <- "https://www.artic.edu/iiif/2"

# using our iiif_url and our cat_image_id, plus some formatting as provided by the AIC API documentation, we get
iiif_url_artwork <- paste0(iiif_url, "/", cat_image_id, "/full/843,/0/default.jpg")

# create an empty temporary file to hold the downloaded image for this R session (further below, we save images to a local directory instead)
temp <- tempfile()

# download the file from our API URL
download.file(iiif_url_artwork, temp, mode="wb")

#Reading the file from the temp object
image_to_plot <- readJPEG(temp)

class(image_to_plot)
## [1] "array"
# plot our image, using ggplot (can also use base R)
ggplot() +
  annotation_custom(rasterGrob(image_to_plot), xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
  theme_void() +
  theme(plot.margin = unit(rep(0, 4), "null"))
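
The "/full/843,/0/default.jpg" suffix follows the IIIF Image API URL pattern: region, size, rotation, then quality and format. As an illustrative sketch, the hypothetical helper below names each segment so that, for example, the requested width can be changed. The image id used here is a placeholder, and the AIC may only support certain sizes, so check its documentation before requesting other widths.

```r
# hypothetical helper that spells out each segment of the IIIF image URL
iiif_image_url <- function(image_id, width = 843,
                           base_url = "https://www.artic.edu/iiif/2") {
  region   <- "full"              # the whole image, no cropping
  size     <- paste0(width, ",")  # fixed width; height scales to keep the aspect ratio
  rotation <- "0"                 # no rotation
  quality  <- "default"
  format   <- "jpg"
  paste0(base_url, "/", image_id, "/", region, "/", size, "/",
         rotation, "/", quality, ".", format)
}

# "example-image-id" is a placeholder, not a real AIC image id
iiif_image_url("example-image-id")
iiif_image_url("example-image-id", width = 400)
```

With the default width this reproduces the URL structure built with paste0() above.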

Finally, let’s build a piece of code for requesting and plotting artwork from the AIC, using any set of search terms we want.

# let's build a function
art_image_search <- function(search_term, n_images = 5, output_dir = "temp_images", clear_directory = TRUE, plot_images = TRUE) {

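  # no manual "%20" substitution here: GET() URL-encodes the values in its query list,
  # so pre-encoding the spaces would double-encode them (e.g. "modern%2520art")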
  
  images_search_url <- "https://api.artic.edu/api/v1/artworks/search?q="
  images_search_out <- GET(images_search_url, # the API endpoint of interest
                       query = list(q = search_term, 
                                    fields = "id, title, artist_display, image_id",
                                    size = n_images)) # query allows us to specify parameters, which we find in the API documentation
  json_images_search_out <- content(images_search_out, "parsed")
  
  # replace NULL values with NA values, field by field, so that every record has the same columns
  json_images_search_out$data <- lapply(json_images_search_out$data, function(record) {
    lapply(record, function(value) if (is.null(value)) NA else value)
  })
  
  image_ids <- do.call(rbind, lapply(json_images_search_out$data, as_tibble)) %>%
                dplyr::select('id', 'title', 'artist_display', 'image_id') 

  # we now check if our output directory exists. if not, we create it. if it does and we want to clear the directory, we do so. else, proceed.
  if (!dir.exists(paste0("./",output_dir))) {
    dir.create(paste0("./",output_dir))
  } else if (clear_directory) {
      unlink(paste0("./",output_dir), recursive = TRUE, force = TRUE)
      dir.create(paste0("./",output_dir))
  }

    # now move to image API query
  iiif_url <- "https://www.artic.edu/iiif/2"
  
  # now work through the image ids, with api queries:
  for(i in seq_len(nrow(image_ids))){
    
    file <- paste0("./", output_dir, "/", image_ids$id[i], ".jpg")
    
    # try() here allows our request to fail without interrupting the run
    try(download.file(paste0(iiif_url, "/", image_ids$image_id[i], "/full/843,/0/default.jpg"), 
                  file, mode="wb"))
    
    # take a breath
    Sys.sleep(1)
    
  }
  
  # enumerate our successfully downloaded files
  downloads <- list.files(paste0("./", output_dir))
  
  # now, if we want to plot images, we save them to a list of ggplots
  if (plot_images == TRUE){
      
    images <- list()
    
      for(j in seq_along(downloads)){ # seq_along() skips the loop if nothing was downloaded
     
      image_to_plot <- readJPEG(paste0("./", output_dir,"/", downloads[j]))
      
      id <- gsub(".jpg", "", downloads[j], fixed = TRUE) # fixed = TRUE treats "." literally, not as a regex wildcard
        
      artist <- image_ids$artist_display[image_ids$id==id]
        
      title <- image_ids$title[image_ids$id==id]

      title_for_image <- paste0(title, " by ", artist)
      
      images[[j]] <- ggplot() +
                      annotation_custom(rasterGrob(image_to_plot), xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) + 
                      ggtitle(str_wrap(title_for_image, 80)) +
                      theme_void() 
    
      } 
    
    # return the object
    return(images)
    
  } else {}
  
}

modern_art_images <- art_image_search("modern art", 10)

modern_art_images
## (a list of 10 plots, [[1]] to [[10]], one per downloaded image; the rendered images are not reproduced here)
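
The step in art_image_search that replaces NULL with NA matters because some artworks have no image, and a NULL field cannot be turned into a tibble column. The sketch below illustrates the idea on its own, using made-up records shaped like parsed search results; the helper null_to_na is hypothetical.

```r
library(tibble)
library(dplyr)

# made-up records shaped like parsed search results; the second has no image
fake_data <- list(
  list(id = 1L, title = "Placeholder A", image_id = "aaa"),
  list(id = 2L, title = "Placeholder B", image_id = NULL)
)

# hypothetical helper: swap NULL fields for NA, leaving everything else as is
null_to_na <- function(record) {
  lapply(record, function(value) if (is.null(value)) NA else value)
}

cleaned <- lapply(fake_data, null_to_na)

# every record now has the same fields, so the rows stack cleanly
fake_images <- bind_rows(lapply(cleaned, as_tibble))
fake_images
```

Rows with an NA image_id can then be filtered out before any download is attempted.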